April 1999

XML for the absolute beginner

A guided tour from HTML to processing XML with Java






Modeling information structure in XML
So far, we've looked at XML as a way of representing data as human-readable documents, and we've spent some time discussing formatting. But XML's real power is in its ability to represent information structure -- how various pieces of information relate to one another -- in much the same way a database might.

Structured documents of the type we've been looking at have the property that all of their elements nest inside one another, as in Listing 5 above. Instead of looking at a document as a file, though, consider what happens if we look at the structure of the tags as a tree:


Figure 3. The recipe represented as a tree structure

The figure above shows the recipe as a tree of document tags. Each child node nests within its parent node. What if there were a way to automagically convert an XML document into a tree of objects in a programming language -- like, oh, say, Java maybe? And what if these objects all had properties that could be set and retrieved -- such as the list of each element's children, the text each object contained, and so on? Wouldn't that be interesting?

The Document Object Model (DOM) Level 1 Recommendation (see Resources), created by a W3C committee, describes a set of language-neutral interfaces capable of representing any well-formed XML or HTML document.

With the DOM, HTML and XML documents can be manipulated as objects, instead of just as streams of text. In fact, from the DOM point of view, the document is the object tree, and the XML, HTML, or what have you is simply a persistent representation of that tree.
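To make the "document is the object tree" idea concrete, here is a minimal sketch in Java. It assumes the JAXP parser (javax.xml.parsers) and the org.w3c.dom interfaces bundled with current JDKs, which this article does not itself name. The recipe markup and the class name DomTreeSketch are hypothetical, invented only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeSketch {
    // Hypothetical recipe markup, invented for this sketch.
    static final String RECIPE =
        "<Recipe><Name>Lime Jello</Name>"
        + "<Ingredients><Ingredient>Lime gelatin</Ingredient>"
        + "<Ingredient>Water</Ingredient></Ingredients></Recipe>";

    // Parse XML text into a DOM Document (the root of the object tree).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Append one line per element, indented by its depth in the tree.
    static void outline(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text nodes and other non-element children
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            outline(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) throws Exception {
        StringBuilder out = new StringBuilder();
        outline(parse(RECIPE).getDocumentElement(), 0, out);
        System.out.print(out);
    }
}
```

Running main prints the element names as an indented outline, which is the same nesting the tree figure depicts. The text inside each element lives in separate text nodes, which this sketch deliberately skips.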

The availability of the DOM makes it much simpler to read and write structured document files, since standard HTML and XML parsers are written to produce DOM trees. If these objects have GUI representations, it's easy to see how to create an application that reads structured document files (XML or HTML), lets the user edit the structure visually, and then saves it in its original format. Programs that interface with existing Web sites become much easier to write, because once the document is parsed, you're working with objects native to your programming language.
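The read, edit, and save cycle might look something like the following sketch. It assumes the JDK's JAXP parser and its identity Transformer for writing the tree back out as text; neither is named in this article, and the element names and the addIngredient helper are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomRoundTripSketch {
    // Parse the XML, append one <Ingredient> element, and serialize the tree.
    static String addIngredient(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Edit the object tree rather than the text of the file.
        Element list = (Element) doc.getElementsByTagName("Ingredients").item(0);
        Element added = doc.createElement("Ingredient");
        added.appendChild(doc.createTextNode(name));
        list.appendChild(added);

        // An identity transform copies the DOM tree to a character stream.
        Transformer writer = TransformerFactory.newInstance().newTransformer();
        writer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        writer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input document, invented for this sketch.
        String xml = "<Recipe><Ingredients>"
            + "<Ingredient>Water</Ingredient></Ingredients></Recipe>";
        System.out.println(addIngredient(xml, "Sugar"));
    }
}
```

The program never handles angle brackets directly: it changes the tree, and the XML text is regenerated from it.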

One of the earliest popular uses for the Document Object Model is Dynamic HTML, where client-side scripts manipulate and display (and redisplay) an HTML document in response to user actions. Dynamic HTML manipulates the client-side document in terms of the scripting language's binding to the DOM structure of the document being displayed. For instance, a <BUTTON> object might, when clicked, reorder a table on the same page by sorting the <TR> (table row) nodes on a particular column.
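Dynamic HTML scripts run in the browser's scripting language, but the same row-sorting idea can be sketched in Java against the DOM interfaces. This is an illustration under stated assumptions, not browser code: it uses the JDK's JAXP parser, a tiny well-formed table invented for the example, and hypothetical helper names (cellText, sortRows).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RowSortSketch {
    // Hypothetical table, written without whitespace between tags.
    static final String TABLE =
        "<TABLE><TR><TD>pear</TD><TD>3</TD></TR>"
        + "<TR><TD>apple</TD><TD>7</TD></TR>"
        + "<TR><TD>fig</TD><TD>5</TD></TR></TABLE>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Text of the given cell; assumes each <TD> holds a single text node.
    static String cellText(Element row, int column) {
        Node cell = row.getElementsByTagName("TD").item(column);
        return cell.getFirstChild().getNodeValue();
    }

    // Reorder the <TR> nodes by one column and return that column's new order.
    static List<String> sortRows(Document doc, int column) {
        Element table = doc.getDocumentElement();
        NodeList found = table.getElementsByTagName("TR");
        List<Element> rows = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            rows.add((Element) found.item(i)); // copy out of the live list
        }
        rows.sort(Comparator.comparing((Element row) -> cellText(row, column)));
        for (Element row : rows) {
            table.appendChild(row); // appending an existing child moves it
        }
        List<String> order = new ArrayList<>();
        NodeList sorted = table.getElementsByTagName("TR");
        for (int i = 0; i < sorted.getLength(); i++) {
            order.add(cellText((Element) sorted.item(i), column));
        }
        return order;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sortRows(parse(TABLE), 0));
    }
}
```

The comparison here is a plain string comparison. A real table sorter would likely need to compare numeric columns as numbers.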

But aside from all this browser-document-Web technology, the DOM provides a common way of accessing general data structures from structured documents. Any language that has a binding (that is, a specific set of interfaces that implement the DOM in that language) can use XML as an interface for storing, retrieving, and processing generic hierarchical (and even nonhierarchical) object structures.

How DOM and XML work together
The DOM opens the door to using XML as the lingua franca of data interchange on the Internet, and even within applications. Tim Berners-Lee, discussed earlier and commonly known as the "inventor of the World Wide Web," says that, these days, it's important to understand that if a system you're designing survives, it will someday be used as a module in another system. So it's best to design it that way from the start. The DOM is completely described in IDL, the Interface Definition Language used in CORBA, so it's connected to existing software interoperation standards.

Let's think a moment about how DOM with XML would be useful in programming a database system. First, represent your database schema as a set of DOM objects. Want a document that describes that schema? No problem: write it out as XML. Use XSL to format the XML as HTML and you've got a complete, browseable schema reference that's always up to date. Want to automatically construct SQL for updating your relational database from a record set coming into your system? Just traverse your database's (DOM) schema tree, matching up the names of the columns from the record set with those of the schema, and build an SQL UPDATE statement as you go. What's that you say? The schema has changed, and the record set you've received doesn't match up with the new schema? You can write code to handle that, or present the user with error messages that state exactly what's wrong. You might even be able to use XSL to refactor the DOM tree of your record set into something matching the new schema.
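The schema-walking idea can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the table and column markup, the key attribute, the class name SchemaUpdateSketch, and the choice to emit "?" placeholders (as used by JDBC prepared statements) rather than literal values.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SchemaUpdateSketch {
    // Hypothetical schema document, invented for this sketch.
    static final String SCHEMA =
        "<table name=\"recipe\" key=\"id\">"
        + "<column name=\"id\"/><column name=\"title\"/>"
        + "<column name=\"servings\"/></table>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Walk the schema tree and build an UPDATE for the fields in the record.
    static String buildUpdate(Document schema, Map<String, String> record) {
        Element table = schema.getDocumentElement();
        String key = table.getAttribute("key");
        NodeList columns = table.getElementsByTagName("column");

        Set<String> known = new HashSet<>();
        StringJoiner assignments = new StringJoiner(", ");
        for (int i = 0; i < columns.getLength(); i++) {
            String name = ((Element) columns.item(i)).getAttribute("name");
            known.add(name);
            if (!name.equals(key) && record.containsKey(name)) {
                assignments.add(name + " = ?");
            }
        }
        // Say exactly which field no longer matches the schema.
        for (String field : record.keySet()) {
            if (!known.contains(field)) {
                throw new IllegalArgumentException(
                    "Record field not in schema: " + field);
            }
        }
        return "UPDATE " + table.getAttribute("name")
            + " SET " + assignments + " WHERE " + key + " = ?";
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "7");
        record.put("title", "Lime Jello");
        record.put("servings", "6");
        System.out.println(buildUpdate(parse(SCHEMA), record));
    }
}
```

Because the statement is driven by the schema tree, the column order follows the schema rather than the incoming record, and a record field the schema doesn't know about is reported by name instead of producing bad SQL.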

Finally, it's time to start programming in Java! In the next section, we're going to examine the Java bindings of the DOM and see how to use the DOM in a Java program.




Copyright © 2003 JavaWorld.com, an IDG company